# Multilingual OCR

PP OCRv4 Mobile Det
Apache-2.0
PP-OCRv4_mobile_det is an efficient text detection model developed by the PaddleOCR team, optimized for mobile devices and suitable for deployment on edge devices.
Text Recognition Supports Multiple Languages
PaddlePaddle
360
0
PP OCRv5 Mobile Rec
Apache-2.0
PP-OCRv5_mobile_rec is the latest-generation text line recognition model developed by the PaddleOCR team. It recognizes four languages (Simplified Chinese, Traditional Chinese, English, and Japanese) and is suitable for a variety of complex text scenarios.
Text Recognition Supports Multiple Languages
PaddlePaddle
499
0
PP OCRv5 Server Rec
Apache-2.0
PP-OCRv5_server_rec is the latest-generation text line recognition model developed by the PaddleOCR team, supporting multilingual text recognition in complex scenarios.
Text Recognition Supports Multiple Languages
PaddlePaddle
8,601
0
Florence Base Mixed Line Bbox Ocr
MIT
An image-to-text model fine-tuned from Microsoft's Florence-2 foundation model, supporting Swedish and English and specializing in historical handwritten text recognition and optical character recognition.
Image-to-Text Safetensors
nazounoryuu
112
0
Mistral Small 1
MIT
An image-text-to-text model built on Mistral-Small-3.1-24B-Instruct-2503, supporting multilingual processing.
Image-to-Text Safetensors Supports Multiple Languages
CreitinGameplays
109
1
Internvl3 2B AWQ
Other
InternVL3-2B is an advanced Multimodal Large Language Model (MLLM) developed by OpenGVLab, featuring exceptional multimodal perception and reasoning capabilities, supporting tool usage, GUI agents, industrial image analysis, 3D visual perception, and more.
Transformers Other
OpenGVLab
677
1
Paligemma2 3b Mix 224 Jax
PaliGemma 2 is an upgraded vision-language model based on Gemma 2, supporting multilingual image-text input and text output and designed for vision-language tasks.
Image-to-Text
google
38
1
Minicpm O 2 6 Int4
The int4 quantized version of MiniCPM-o 2.6, significantly reducing GPU VRAM usage while supporting multimodal processing capabilities.
Text-to-Audio Transformers Other
openbmb
4,249
42
Paligemma2 28b Mix 224
PaliGemma 2 is an upgraded vision-language model launched by Google, combining the Gemma 2 language model with the SigLIP vision encoder and supporting multilingual image-text interaction tasks.
Image-to-Text Transformers
google
2,050
4
Paligemma2 28b Mix 448
PaliGemma 2 is a vision-language model based on Gemma 2, supporting image+text input and text output, suitable for various vision-language tasks.
Image-to-Text Transformers
google
198
26
Paligemma2 10b Mix 224
PaliGemma 2 is a vision-language model based on Gemma 2, supporting image and text input to generate text output, suitable for various vision-language tasks.
Image-to-Text Transformers
google
701
7
Paligemma2 3b Mix 448
PaliGemma 2 is a vision-language model based on Gemma 2 that accepts image and text input and generates text output, suitable for various vision-language tasks.
Image-to-Text Transformers
google
20.55k
44
Trocr Nepali
An optical character recognition model based on the TrOCR architecture, fine-tuned for Nepali text in Devanagari script.
Text Recognition Transformers Other
syubraj
175
0
Thai Trocr
Apache-2.0
A Thai and English optical character recognition model fine-tuned from the TrOCR base handwriting model, well suited to handwritten text line images.
Text Recognition Transformers Supports Multiple Languages
openthaigpt
2,677
9
Urdu Ocr
This model is specifically trained for Urdu OCR tasks and is most suitable for processing single-line Urdu text images, primarily focusing on printed text.
Text Recognition Transformers Other
cxfajar197
114
1
Trocr Medieval Cursiva
MIT
A TrOCR-based recognition model for medieval cursive script, designed for handwritten texts in Latin, French, Italian, Spanish, and Catalan from the medieval period.
Text Recognition Transformers Supports Multiple Languages
medieval-data
18
1
Trocr Base Ru
Apache-2.0
TrOCR-Ru is an optical character recognition model based on microsoft/trocr-base-handwritten and fine-tuned on synthetic Russian and English datasets for image-to-text tasks.
Text Recognition Transformers Supports Multiple Languages
sherstpasha99
30
0
Trocr Base Finetune Numbers
TrOCR is a Transformer-based optical character recognition model designed to extract text content from images.
Image-to-Text Transformers English
ANANDHU-SCT
23
0
Trocr Base Ckb
An OCR model based on the Transformer architecture, designed for recognizing Central Kurdish text and trained on synthetic data.
Text Recognition Transformers
razhan
19
0
Pix2struct Ocrvqa Base
Apache-2.0
A Pix2Struct model fine-tuned for OCR-VQA tasks, capable of parsing textual content in images and answering questions about it.
Image-to-Text Transformers Supports Multiple Languages
google
38
1
Pix2struct Docvqa Base
Apache-2.0
Pix2Struct is an image encoder-text decoder model trained on image-text pairs, supporting various tasks including image captioning and visual question answering.
Image-to-Text Transformers Supports Multiple Languages
google
8,601
37
Pix2struct Chartqa Base
Apache-2.0
Pix2Struct is an image encoder-text decoder model trained on image-text pairs for a range of tasks; this checkpoint is fine-tuned for chart question answering.
Image-to-Text Transformers Supports Multiple Languages
google
181
8
Donut Base Finetuned Latvian Receipts
MIT
This model is a fine-tuned version of donut-base trained on a Latvian receipt dataset, primarily used for receipt image processing tasks.
Text Recognition Transformers
Inesence
31
0
Doctr Torch Crnn Mobilenet V3 Large French
An optical character recognition (OCR) model based on TensorFlow 2 and PyTorch, supporting multilingual text detection and recognition.
Text Recognition Transformers Supports Multiple Languages
Felix92
33
3
Doctr Tf Crnn Vgg16 Bn French
An optical character recognition (OCR) model based on TensorFlow 2 and PyTorch, supporting multilingual document recognition.
Text Recognition Transformers Supports Multiple Languages
Felix92
16
0
© 2025 AIbase